Abstract



This paper considers a combination of intelligent repositioning decisions and dynamic
pricing for the improved operation of shared mobility systems. The approach is applied
to London's Barclays Cycle Hire scheme, which the authors have simulated based on
historical data. Using model-based predictive control principles, dynamically varying
rewards are computed and offered to customers carrying out journeys. The aim is to
encourage them to park bicycles at nearby under-used stations, thereby reducing the
expected cost of repositioning them using dedicated staff. In parallel, the routes that
repositioning staff should take are periodically recomputed using a model-based
heuristic. It is shown that it is possible to trade off reward payouts to customers against
the cost of hiring staff to reposition bicycles, in order to minimize operating costs for
a given desired service level.



1 Introduction 



Public Bicycle Sharing (PBS) schemes offer the rental of bicycles as a means of public

transportation in urban areas. They allow registered users to pick up a bicycle from one of 
many docking stations throughout the entire city, without any prior notice. The bicycle is 
returned to another station, after which the user's intended destination is usually reached 
on foot. Short journeys are encouraged by charging users only a small fee for a short rental 
period (typically less than one hour), but then ramping up the cost significantly for longer 
use. 

Such schemes have been introduced in major cities as an alternative to often slow and 
crowded mass transportation. Many have grown considerably in size in recent years [1][2],
and are becoming a major component of their cities' public transportation systems. In 2008, 
for example, 120,000 daily journeys were being made using shared bicycles in Paris [3]. 



Most PBS schemes are still unable to recoup their full operational and investment costs 
solely from customer fees. According to [4], capital costs can be up to $4,500 per bicycle,
and annual operational costs up to $1,700 per bicycle. Sometimes additional revenues from
advertising can be used to mitigate this cost gap. However, in almost all cases additional
funding from public sources is required [1][5].

One of the major contributors to operational costs is the need to operate staffed trucks 
for manual relocation of bicycles, in order to balance the difference between supply and 
demand at various stations. If this effort were not made, the arrival and departure of 
customers would cause many stations to run full or empty, and the customer service rate 
would drop below acceptable levels [6][7]. Since this redistribution of bicycles entails costs,
a trade-off for the desired service level needs to be made. 

The goal of this paper is to show how the system performance could be optimized by 
trading off two complementary methods. Firstly, we devise an algorithm to optimize the 
dynamic route-planning of multiple trucks for bicycle relocation. Secondly, on top of this 
manual repositioning, we propose a scheme that offers users price incentives based on the 
current and predicted state of the system, in order to encourage them to change the endpoint 
of their journeys. These incentives are set to shift bicycle drop-offs away from stations that 
are overfilled, and towards nearby stations that may have empty spaces. The price incentives 
are independent of the usual rental fees, which we assume to be a sunk cost for the user. 

This paper's main contributions can be summarized as follows: 

1. A tailored routing algorithm that plans how trucks will be used to redistribute bikes 
amongst stations. Redistribution is performed in the dynamic setting, i.e. while the 
system is in operation. The heuristic chooses the actions of multiple trucks, with the 
aim of enabling as many extra journeys as possible to take place. 

2. A dynamic incentives scheme where customers are encouraged to change their target 
station in exchange for a payment. Changes to journey length may be inconvenient, 
and we assume customers accept or reject such incentives based on the value of their 
time and the payment offered. 

Both the truck routes and the price incentives are recomputed online at periodic time
instances. For both components, a predictive model of the expected near-future evolution
of the system state is used to optimize their actions over a finite, receding horizon. The
optimization goal is to maximize the number of additional journeys enabled via
repositioning, taking into account available resources and cost trade-offs. At each
re-optimization step, up-to-date information on the current state of the system is used to
plan all future operational decisions. This is shown schematically in Fig. 1.

We evaluate the trade-off between these two methods using a Monte Carlo model of 
the London Cycle Hire PBS scheme, constructed from detailed historical usage information. 

Figure 1: Schematic of the online optimization scheme presented in this paper. In step 2,
truck routes are planned based on a model of how the bike movements will evolve from the 
current state, and if necessary new orders are issued. In step 3, the current system state 
and the future truck actions are taken into account, and new price incentives for users are 
computed, based on a trade-off between payouts and system performance. 

Our results suggest that service level improvements may be attained using price incentives 
alone, and that increases in either the customer payouts or the number of repositioning 
trucks deliver diminishing returns to service levels. Unsurprisingly, higher service levels can 
be reached on weekends in comparison to weekdays. 

The paper is organized as follows. In Section 2 we explain how a model of the London
PBS scheme was derived using historical data. In Section 3 we develop a metric for the
utility of repositioning actions based on the expected ability to serve additional future
customers. The results are used in Section 4 to develop a heuristic for determining the routes
of repositioning trucks. In Section 5 we develop a model-based controller for computing the
price incentives offered to customers. The two approaches for repositioning are compared
using a Monte Carlo simulation framework in Section 6. Section 7 draws conclusions on the
performance of the scheme.



2 System model 

2.1 Historical data 

The PBS system model used in this paper is based on London's Barclays Cycle Hire scheme. 
For modeling, we used three data sets made publicly available by the Transport for London 
authority:^1

1. 1.42 million rides spanning a period of 97 days (examples in Table 1),

2. Size and location of 354 stations actively used during the recorded period (examples
in Table 2),



^1 http://www.tfl.gov.uk/businessandpartners/syndication/default.aspx



Table 1: Ride information samples

bike-id   start (date, station-id)        end (date, station-id)
3340      {2010-07-30 06:00:00, 47}       {2010-07-30 06:22:00, 47}
3870      {2010-07-30 06:00:00, 234}      {2010-07-30 06:14:00, 203}
1627      {2010-07-30 06:01:00, 149}      {2010-07-30 06:29:00, 293}
1695      {2010-07-30 06:02:00, 152}      {2010-07-30 06:06:00, 324}



Table 2: Station information samples

id   name                               position (lat, lon)    size
1    River St, Clerkenwell              {51.5291, -0.1099}     18
2    Phillimore Gardens, Kensington     {51.4996, -0.1975}     34
3    Christopher St, Liverpool St       {51.5212, -0.0846}     33
4    St. Chad's Street, King's Cross    {51.5300, -0.1200}     22

3. An initial station fill level recorded during nighttime when all bicycles were docked.
In total, we estimate that the system contained 3708 bikes at the time for which
historical data is available.

Analysis of the historical journeys reveals regular daily flow patterns, with a substantial
difference between weekdays and weekends. Journeys are allowed between 6am and
midnight. As expected, many customers commute to the city center in the morning and ride
back to the outer districts in the late afternoon. This pattern, absent on weekends, causes
two spikes in daily rental activities that are illustrated in Fig. 2.

2.2 Model parameters

In our model, we define a set $S$ containing all stations $s \in S$. Time $t$ is assumed to
be discrete and indexed on a one-minute level, where $T_{\text{hist}}$ denotes all time steps of the
observed period. We distinguish between workdays and days on weekends by the binary
variable $w \in \{\text{weekday}, \text{weekend}\}$ and split every day into 72 slices $k \in K$ of 20 minutes
each. Time is mapped to day-type and timeslice using $w(t)$ and $k(t)$, respectively. All
customer departure and arrival events are counted in matrices of dimension $|S| \times |S|$. The
sum of departing customers going from station $i$ to $j$ in a timeslice $k$ and on a day-type $w$ is
$D_{i,j}(k, w)$; similarly, the sum of customers who arrive at station $j$ coming from $i$ is $A_{i,j}(k, w)$.
The average number of such departure events $\mu_{i,j}(t)$ and arrival events $\lambda_{i,j}(t)$ at time $t$ in
the historical data can thus be expressed as

$$\mu_{i,j}(t) = \frac{D_{i,j}(k(t), w(t))}{\left|\{t' \in T_{\text{hist}} : k(t') = k(t),\ w(t') = w(t)\}\right|}, \tag{1}$$

$$\lambda_{i,j}(t) = \frac{A_{i,j}(k(t), w(t))}{\left|\{t' \in T_{\text{hist}} : k(t') = k(t),\ w(t') = w(t)\}\right|}. \tag{2}$$

Here, the denominator gives the duration of the recorded history for a given timeslice and
day-type indicated by $t$.
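The averaging in Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustration under assumptions: the record layout `(start_time, start_station, end_station)`, the helper names, and the `observed_days` dictionary (number of recorded days per day-type) are hypothetical and not taken from the original data pipeline.

```python
from collections import Counter
from datetime import datetime

SLICE_MINUTES = 20  # 72 slices of 20 minutes per day, as in the text


def day_type(ts):
    """Map a timestamp to the binary day-type w(t)."""
    return "weekend" if ts.weekday() >= 5 else "weekday"


def timeslice(ts):
    """Map a timestamp to its 20-minute slice index k(t) in 0..71."""
    return (ts.hour * 60 + ts.minute) // SLICE_MINUTES


def departure_rates(rides, observed_days):
    """Estimate mu_{i,j} per one-minute step for every (i, j, k, w).

    rides: iterable of (start_time, start_station, end_station) (assumed layout).
    observed_days: dict mapping day-type to the number of recorded days.
    """
    counts = Counter()
    for start, i, j in rides:
        counts[(i, j, timeslice(start), day_type(start))] += 1
    rates = {}
    for (i, j, k, w), n in counts.items():
        # Denominator of Eq. (1): one-minute steps observed for slice k, type w.
        minutes = observed_days[w] * SLICE_MINUTES
        rates[(i, j, k, w)] = n / minutes
    return rates
```

Arrival rates as in Eq. (2) would follow from the same routine applied to the end time and the reversed station pair.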

Based on these average numbers of events per time step, customer departures are modeled
as exponentially distributed with time-varying parameter $\mu_{i,j}(t)$. This parameter fit is
based on the following implicit assumptions:

• 100% service rate for departures in the historical data. Potential customers who could
not rent a bicycle due to an empty station are excluded, as they are not recorded in
the historical data. This assumption is justified to some degree by the considerable
repositioning effort made by the operator of the London PBS scheme [5].

• Independence of customer arrivals. Departures of customers vary with time and type of
day, but do not depend on other departures. As a caveat, this does not accurately
model customer groups, for example tourists.

• Effects of season, weather, events, etc. are disregarded, but could easily be included
in a more detailed model.

Once a customer has departed from a station, the probability distribution of his destination is
given by the relative frequencies in the historical data, as recorded in $\lambda_{i,j}(t)$.

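The total expected departures $\mu_s(t)$ and arrivals $\lambda_s(t)$ at each station, and the net change
of fill level $\eta_s(t)$ during time step $t$, are therefore

$$\mu_s(t) = \sum_{\tilde{s} \in S} \mu_{s,\tilde{s}}(t), \qquad \lambda_s(t) = \sum_{\tilde{s} \in S} \lambda_{\tilde{s},s}(t), \tag{3}$$

$$\eta_s(t) = \lambda_s(t) - \mu_s(t). \tag{4}$$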
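As a small illustration of Eqs. (3) and (4), the following Python sketch sums pairwise rates into a net change for one station at one time step. The dictionary layout keyed by `(origin, destination)` and the function name are assumptions made for this example only.

```python
def station_net_change(mu, lam, stations, s):
    """Net change eta_s of the fill level of station s for one time step.

    mu[(i, j)]: expected departures from i heading to j (assumed layout).
    lam[(i, j)]: expected arrivals at j coming from i (assumed layout).
    """
    departures = sum(mu.get((s, j), 0.0) for j in stations)   # mu_s, Eq. (3)
    arrivals = sum(lam.get((i, s), 0.0) for i in stations)    # lambda_s, Eq. (3)
    return arrivals - departures                               # eta_s, Eq. (4)
```

A negative value indicates that the station is expected to drain during the time step.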

In order to simulate the system, the following assumptions about the behavior of customers 
are needed, in addition to their arrival rates: 

• Customers who want to depart from a station that turns out to be empty leave without 
starting a journey. They do not wait for a bicycle to be returned, nor do they walk 
on to a neighboring station. 

• The travel time between any two stations i and j is always equal to the average travel 
time extracted from the historical data. Figure [3] depicts the historical distribution of 
travel speeds and journey durations. 

• Customers who arrive at their target station wanting to return their bicycle when
the station is full ride on to one of the neighboring stations (chosen according to their
perceived utility, as described in Section 2.3). If this station is also full, the customer
goes on to the next station, but does not return to any station already visited.

2.3 Customer decision model 

In order to investigate the effects of offering price incentives, a model of how the customer 
reacts is required. We assume all customers place a value on the additional time they would 
spend travelling if they were to accept an incentive. This is equivalent to penalizing a longer 
travel distance. The additional distance a customer has to travel if he changes his target 
station consists of the additional distance he has to bike, plus an additional walking distance 
to his final destination. We assume this final destination lies at the center of mass of the 
Voronoi region around each station (Figure 4). The Voronoi region is the polytope that
contains all points closer to a given station than to any other. 

Let the center of mass of the Voronoi region around each station $s_i$ be denoted by $m_i$,
and let $d_{\text{eucl}}$ be the Euclidean distance on the map. Assuming that the walking speed is half
the cycling speed, the effective distance $d_{i,j}$ between two stations $s_i$ and $s_j$ can be expressed
as

$$d_{i,j} = d_{\text{eucl}}(s_i, s_j) + 2 \cdot d_{\text{eucl}}(s_j, m_i) - 2 \cdot d_{\text{eucl}}(s_i, m_i). \tag{5}$$

Figure 4: Voronoi partitioning with centers of mass (CoM) of the London bike sharing
system.
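The effective distance of Eq. (5) can be sketched as follows, reading it as the cycling detour plus twice the change in walking distance to the center of mass $m_i$ (walking at half the cycling speed). Planar coordinates in kilometres and the function names are assumptions of this example.

```python
import math


def d_eucl(a, b):
    """Euclidean distance between two planar points (x, y), assumed in km."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def effective_distance(s_i, s_j, m_i):
    """Effective extra distance for diverting from station s_i to s_j.

    m_i is the assumed final destination (center of mass of the Voronoi
    region of s_i). Walking counts double because it is half as fast.
    """
    extra_cycling = d_eucl(s_i, s_j)
    new_walk = 2.0 * d_eucl(s_j, m_i)
    old_walk = 2.0 * d_eucl(s_i, m_i)
    return extra_cycling + new_walk - old_walk
```

When $m_i$ coincides with $s_i$, the result is three times the station-to-station distance: one part cycled and two parts for the walk back.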

The incentives to go to a neighboring station are offered to customers upon their arrival.
Each customer decides whether to take an incentive by maximizing his personal benefit,
based on the incentive payout and the customer's perceived cost of the additional distance
traveled. This implies customer rationality and makes the choice independent of the
original pricing of journeys. For each end station $s \in S$, the set of neighboring stations to
which a price incentive could be offered is $N_s$, and $p_{s,n}$ denotes the amount of money offered
to go from station $s$ to neighbor $n \in N_s$. In addition, let $\bar{N}_s$ be the set of stations which have
$s$ in their neighbor set. The following model of customer reactions is used.

1. The marginal cost of travel $c$ for each arriving customer is drawn from a uniform
distribution $c \sim U[0, c_{\max}]$, where we have used $c_{\max} = \pounds 20/\text{km}$ in our simulations.^2

2. The customer selects the best offer of maximum value as

$$n^* = \arg\max_{n \in N_s} \left( p_{s,n} - d_{s,n} \cdot c \right). \tag{6}$$

3. If the original target station is full, i.e. the customer cannot return his bike there, he
always chooses the best incentive to go to $n^*$. If there is space, he takes an incentive
only if the perceived value of the best incentive is positive, i.e. if $p_{s,n^*} - d_{s,n^*} \cdot c > 0$.

The probability $\pi(s, n, p_s)$ of an arriving customer taking an incentive to neighbor $n \in N_s$
for a given payout vector $p_s$ depends on the distribution of perceived travel costs $c$. First,
the offer to go to $n$ must have the highest perceived value amongst all incentives offered
to neighboring stations. Second, assuming the station $s$ is not full, the perceived cost of
traveling the additional distance must be lower than the relevant payout $p_{s,n}$:

$$\pi(s, n, p_s) = P\left( p_{s,n} > c \cdot d_{s,n} \ \wedge\ p_{s,n} - c \cdot d_{s,n} > p_{s,n'} - c \cdot d_{s,n'},\ \forall n' \in N_s \right). \tag{7}$$

^2 Future work could incorporate more detailed models of how customers value their time (see [Ol llOn). In
addition, instead of having a single distribution, one could also differentiate between customer types (e.g.
those commuting to work and those riding during leisure times) by introducing a time-varying component.
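The customer reaction model and the acceptance probability of Eq. (7) can be illustrated with a short Monte Carlo sketch. The function names, the dictionary layout for payouts and distances, and the sample count are assumptions for this example; the uniform cost distribution with an upper bound of 20 per km follows the text.

```python
import random


def choose_offer(payouts, distances, c, station_full):
    """Return the neighbor a customer diverts to, or None if he stays.

    payouts[n] and distances[n] hold p_{s,n} and d_{s,n} for each neighbor n.
    c is the customer's marginal cost of travel per km.
    """
    if not payouts:
        return None
    # Best offer n* by perceived value, as in Eq. (6).
    best = max(payouts, key=lambda n: payouts[n] - distances[n] * c)
    value = payouts[best] - distances[best] * c
    # A full target station forces the diversion; otherwise value must be positive.
    if station_full or value > 0:
        return best
    return None


def acceptance_probability(payouts, distances, n, c_max=20.0,
                           samples=20000, seed=0):
    """Monte Carlo estimate of pi(s, n, p_s) for a station that is not full."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        c = rng.uniform(0.0, c_max)
        if choose_offer(payouts, distances, c, station_full=False) == n:
            hits += 1
    return hits / samples
```

For a single neighbor, the estimate approaches $\min(1, p_{s,n} / (d_{s,n} \cdot c_{\max}))$, which gives a quick sanity check of the sampler.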

3 Utility of changes in station fill level 

In this paper, two methods are considered for influencing the distribution of bicycles in the
PBS: manual repositioning (Section 4) and price-led repositioning via incentives (Section
5). However, as a basis for both algorithms, it is necessary to estimate the benefit that any
change in the stations' fill levels will bring about. Since the system is stochastic, it is not
straightforward to assess these benefits. In this section we introduce a function that
estimates the utility of changes in fill levels for a given station.

Raviv and Kolka [11] have done related work in order to determine the best fill level
of each station in a static repositioning setting. Their approach tracks the probability of
all possible fill levels based on a discrete approximation of the underlying continuous
birth-death process. However, if we were to adopt this method, the dimension of the resulting
optimization problem would significantly increase the computational complexity of the
approaches proposed in Sections 4 and 5. Therefore we propose a simpler approach.

We make the simplifying assumption that arrivals and departures are deterministic and 
given by the expected net change $\eta_s(t)$. Furthermore, we define the utility of changes to
a station's fill level as the difference in the number of customers expected to be served 
successfully at that station within a long enough (but finite) time horizon. The benefit of 
any repositioning action (adding or taking away bikes at a single station) at the current 
time can then be evaluated based on this notion. 

For a given station $s$ and starting time $t_0$, we precompute the expected future fill level
$f_s^t$ over a time horizon with $t = t_0, \ldots, t_0 + T_{\text{util}}$, where $T_{\text{util}} = 24$ hours is the look-ahead
period considered. The expected fill level is governed by the following dynamics:

$$f_s^{t+1} = \max\left(0, \min\left(f_s^t + \eta_s(t),\ f_s^{\max}\right)\right), \tag{8}$$

where $f_s^{\max}$ denotes the maximum capacity of the station, and the max and min functions
ensure that the station never becomes "more than full" or "less than empty". The quantity
$\eta_s(t)$ is the net arrival rate defined by (4).
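The clipped dynamics of Eq. (8) amount to a simple forward recursion. The sketch below assumes the net arrivals are supplied as a list with one entry per time step; the function name is hypothetical.

```python
def expected_fill_levels(f0, eta, f_max):
    """Propagate the expected fill level of one station over a horizon.

    f0: fill level at the starting time t_0.
    eta: list of expected net arrivals eta_s(t), one entry per time step.
    f_max: maximum capacity of the station.
    Returns the list [f^{t_0}, f^{t_0+1}, ...] with len(eta) + 1 entries.
    """
    levels = [f0]
    for net in eta:
        # Saturate at the station limits: never below empty, never above full.
        levels.append(max(0.0, min(levels[-1] + net, f_max)))
    return levels
```

For example, `expected_fill_levels(10, [1, 1, 1, -5], 12)` saturates at 12 in the third step before dropping to 7.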

Adding or taking away bikes from the station at the current time changes how many
customers can be served later on in the time horizon, since the station will become empty
or full at different times in the future. For a current fill level $f_s^{t_0}$ and a change in fill level
$\Delta f$ to be made at $t_0$, Algorithm 1 computes the utility $u(s, t_0, f_s^{t_0}, \Delta f)$ by comparing the two
cases of different initial fill level. The algorithm moves forward through the time horizon,
and in each time step compares the amounts of change $\Delta$ and $\tilde{\Delta}$ resulting from the
system dynamics and the station size constraints in the original and the adapted case,
respectively. If the amount of change is lower in the original case, a station size constraint
is being hit earlier than in the adapted case, and vice versa. The difference between
$\tilde{\Delta}$ and $\Delta$ is the difference in customers that could be served successfully in that time
period. The procedure terminates if the end of the time horizon has been reached or if the
fill level of the station becomes equal in both cases (e.g. both are full or empty).

Algorithm 1 Computing the utility of repositioning

Require: $s \in S$, $T_{\text{util}}$   ▷ Relevant station and horizon length,
    $f_s^{\max}$   ▷ Maximum capacity of station $s$,
    $\eta_s(t)$   ▷ Expected net arrival of customers,
    $f_s^t\ \forall t \in \{t_0, \ldots, t_0 + T_{\text{util}}\}$   ▷ Precomputed fill levels in the original case according to (8),
    $\tilde{f}_s^{t_0}$   ▷ Starting fill level in the case with repositioning

procedure REPOSITIONINGUTILITY($t$, $\tilde{f}_s^t$)
    if $\tilde{f}_s^t = f_s^t$ or $t > T_{\text{util}}$ then
        return 0
    end if
    $\tilde{f}_s^{t+1} \leftarrow \max(0, \min(\tilde{f}_s^t + \eta_s(t),\ f_s^{\max}))$
    $\Delta \leftarrow |f_s^t - f_s^{t+1}|$
    $\tilde{\Delta} \leftarrow |\tilde{f}_s^t - \tilde{f}_s^{t+1}|$
    return $\tilde{\Delta} - \Delta$ + REPOSITIONINGUTILITY($t + 1$, $\tilde{f}_s^{t+1}$)
end procedure
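An iterative Python sketch of the idea behind Algorithm 1 is given below. It is written under stated assumptions rather than as a transcription: the utility is accumulated as the change served in the adapted case minus the change served in the original case, the adapted starting level is clipped to the station limits, and the net arrivals are passed as a list covering the horizon. The function name is hypothetical.

```python
def repositioning_utility(f0, delta_f, eta, f_max):
    """Utility of changing the fill level by delta_f at the start of the horizon.

    f0: current fill level of the station.
    delta_f: bikes added (positive) or removed (negative) now.
    eta: expected net arrivals per time step over the horizon.
    f_max: maximum capacity of the station.
    """
    f = float(f0)
    # Assumption: a change beyond the station limits is clipped.
    f_adapted = max(0.0, min(f0 + delta_f, f_max))
    utility = 0.0
    for net in eta:
        if f_adapted == f:
            break  # both trajectories coincide from here on
        f_next = max(0.0, min(f + net, f_max))
        f_adapted_next = max(0.0, min(f_adapted + net, f_max))
        # Customers served in the adapted case minus those in the original case.
        utility += abs(f_adapted_next - f_adapted) - abs(f_next - f)
        f, f_adapted = f_next, f_adapted_next
    return utility
```

Under these assumptions, adding 5 bikes to an empty station with a steady net outflow of one bike per step serves 5 additional customers, while adding 2 bikes to a station that is about to fill up turns away 2 returning customers.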

Although the worst-case time complexity of Algorithm 1 is linear with respect to the
horizon length, it would take too long to use online in the later optimization steps. Storing
the results in a lookup table is intractable, especially if the expected future fill levels are
treated as continuous values. However, fast computation can be achieved by constructing
a simpler function, making use of a lookup table of dimension $2 \times |S| \times T_{\text{util}}$, from which
this utility can be determined. Figure 5 shows an example of such a utility function for an
empty station, $u(s, t, f_s^t = 0, \Delta f)$. For simplicity, $\Delta f$ is also relaxed to be non-integral. The
validity of this parameterization is now proven.

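Theorem 1. For any station $s \in S$ at time $t$, there exist two fill levels $\underline{f}_s^t, \overline{f}_s^t \in [0, f_s^{\max}]$,
independent of the initial fill level $f_s^t$, such that the utility of a change in fill level $\Delta f$
satisfies

$$\frac{\partial u(s, t, f_s^t, \Delta f)}{\partial \Delta f} =
\begin{cases}
1, & \text{if } f_s^t + \Delta f < \underline{f}_s^t, \\
0, & \text{if } \underline{f}_s^t \le f_s^t + \Delta f \le \overline{f}_s^t, \\
-1, & \text{if } f_s^t + \Delta f > \overline{f}_s^t.
\end{cases} \tag{9}$$

The utility of no change of the station's fill level is understood to be zero, $u(s, t, f_s^t, 0) := 0$.
The (possibly empty) interval $[\underline{f}_s^t, \overline{f}_s^t]$ is called the "plateau" of constant maximum
utility.

Figure 5: Example for an empty station ($f_s^{t_0} = 0$), showing the utility "plateau" between the
fill levels $\underline{f}_s^t$ and $\overline{f}_s^t$. In this case maximum utility results from adding 4 to 8 bicycles at $t_0$,
based on the expected net arrival rate over the time horizon.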

Proof. Assume the station will become full within the time horizon $T_{\text{util}}$ for initial fill level
$f_s^t + \Delta f$. Adding $\delta$ bicycles to the station implies that $\delta$ additional expected customers who
want to return their bicycles have to be rejected. Therefore, the utility $u(s, t, f_s^t, \Delta f + \delta)$
decreases monotonically with slope $-1$ for any $\delta > 0$ with $f_s^t + \Delta f + \delta \le f_s^{\max}$:

$$u(s, t, f_s^t, \Delta f + \delta) = u(s, t, f_s^t, \Delta f) - \delta.$$

A similar case can be made for fill levels where the station is running empty, except that the
utility decreases as more bicycles are removed, so that the slope of $u$ is equal to 1. It follows
naturally that for a given initial fill level there are thresholds $\underline{f}_s^t, \overline{f}_s^t$, with $\underline{f}_s^t < \overline{f}_s^t$, where the
station first starts to run empty or full within the horizon. Within the interval $[\underline{f}_s^t, \overline{f}_s^t]$, the
station's capacity constraints are not hit and the utility function must therefore be constant,
leading to a "plateau" of the type shown in Fig. 5. □
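The piecewise-linear shape established by Theorem 1 means that the utility can be evaluated from the two plateau thresholds alone. The following Python sketch integrates the slopes of Eq. (9) with the convention that no change has zero utility; the function and argument names are hypothetical, and clipping of the resulting fill level to the station capacity is left out for brevity.

```python
def plateau_utility(f, delta_f, f_low, f_high):
    """Utility of a fill-level change, given the plateau [f_low, f_high].

    f: current fill level; delta_f: proposed change (may be non-integral).
    f_low, f_high: lower and upper plateau thresholds for this station and time.
    """
    def potential(x):
        # Slope +1 below the plateau, 0 on it, -1 above it.
        if x < f_low:
            return x - f_low
        if x > f_high:
            return f_high - x
        return 0.0

    # Normalized so that delta_f = 0 yields zero utility.
    return potential(f + delta_f) - potential(f)
```

With thresholds 4 and 8 and an empty station, adding anywhere between 4 and 8 bikes yields the maximum utility of 4, in line with the plateau described for Figure 5.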

It can be shown that $u(s, t, f_s^t, \Delta f)$ can be computed for each station using only three
calls to Algorithm 1. Sections 4 and 5 will make use of this characterization of station fill
utilities in order to choose how trucks reposition bikes and how price incentives can be
offered to customers.

The system dynamics (8) are based on deterministic net arrival of customers. This
results in a coarser model than, for example, the probabilistic approach of [11]. However, this
simple parameterization is attractive in that it leads to tractable optimization problems, as
will be seen in Sections 4 and 5.

4 A dynamic truck-routing algorithm 

This section describes an algorithm for intelligent operation of a fleet of R trucks, which 
move bikes between stations as needed. Their objective is to increase the system utility (as 
defined in Section 3), and hence ultimately the system's service level, as much as possible.

The problem of manual relocation of vehicles in shared vehicle systems is not new. It 
originated in pilot car-sharing projects, such as the French Praxitèle [12][13], Intellishare [14]
and Honda ICVS [15]. However, the repositioning algorithms used in car-sharing projects do
not translate directly to the public bicycle-sharing scheme considered in this paper. Firstly, 
each of these algorithms exhibits certain characteristics that are specific to its corresponding 
vehicle-sharing system, for example charging times of electric vehicles. Secondly, car-sharing 
systems tend to be much smaller in their network size than the PBS considered in our paper, 
and the proposed algorithms cannot easily be scaled to several hundred stations. 

Intelligent repositioning in bicycle-sharing schemes has also received prior attention in 
the literature [1]. The proposed approaches can be separated into static and dynamic
approaches. 

In the static repositioning approach, an optimal route is computed in order to attain 
a predefined fill level for each station, prior to customers interacting with the system (e.g. 
during the night). For example, [16] present a solution for the routing of a single truck, 
and [17] consider the case of multiple trucks. The advantage of this approach is that there
is ample time to compute a good truck routing solution, and this solution could serve as a
reliable basis for computing the price incentives (Section 5). However, static repositioning
has shown too little flexibility to react to unforeseen variations in the demand pattern,
caused for example by unexpected weather conditions. 

In the dynamic repositioning approach, the truck routing is planned in a receding horizon 
fashion while the system is in full operation. This allows the planning to react online to 
unexpected changes in the system's state. As such it is a more suitable approach to our 
problem. Nair and Miller-Hooks [18] use a stochastic formulation for dynamic repositioning
based on stationary distributions for customer arrivals and departures. However, their
approach is not suitable for our PBS, since these distributions vary considerably
according to the distinct daily flow patterns described in Section 2.1. Contardo et al. [19]
present another dynamic repositioning approach with time-varying, yet deterministic, future
flow patterns. However, the computational complexity of their approach is prohibitive for our
system, because it is too expensive to simulate the system with multiple trucks over a long
time horizon.

Whilst many PBS schemes redistribute bicycles during the night, nighttime operation is
restricted in London in several key areas [20]. Therefore, this paper considers the dynamic
repositioning case only. It is a variation of the routing problem with pickups and deliveries
for one commodity, taken from [21]. It is illustrated in Figure 6. In order to determine a truck
route, time is discretized into 5-minute intervals and the planning problem is considered
on the time-expanded network (Section 4.1). During each interval, customer behavior is
assumed to be time-varying, but deterministic. A receding planning horizon is considered,
which is the maximum of the truck visiting $N_{\text{truck}} = 4$ stations and $T_{\text{truck}} = 40$ min. The period for
re-optimizing the truck routes ("implementation horizon") is chosen as $T_{\text{impl}} = 30$ min,
based on a trade-off between computation time and performance quality. Note that, as shown
in Figure 6, the planning horizon is longer than the re-optimization cycle. This improves the
performance of truck journeys beginning shortly before the next re-optimization, as these
journeys are likely to end within the longer planning horizon, where subsequent opportunities
are still considered.

To solve the routing problem for a single truck (Section 4.2) over the finite planning
horizon, we adopt a two-step approach. First, for each truck we construct a tree of
"promising route candidates" using a greedy heuristic. This means that truck routes are
extended by stations that promise high ratios of utility added per time to travel. Second, for
each of the promising routes, the number of bikes to be loaded or unloaded at each stop is
optimized once the complete route is known. Then the route providing the highest
utility improvement is selected. Finally, we extend this algorithm in a simple manner to
multiple trucks (Section 4.3).

4.1 Modeling repositioning truck routes on a time-expanded network

We now describe how the truck routes can be modeled as a time-expanded network on a
graph $G = (V, A)$ [inill^, which is the basis of our truck-routing algorithm. The vertices
$V$ of the graph consist of tuples $v = (s, t)$ with $s \in S$, $t \in T$, $v \in V$.^3 The (expected future) fill
level $f$ for each vertex $v$ (i.e. station $s$ at time $t$) is generated according to (8). The arcs
$a = (v_1, v_2) \in A$ of $G$ correspond to possible journeys a truck is able to take.

^3 The notation $s(v)$ is used to access the component $s$ of the tuple $v = (s, t)$.

Figure 6: Illustration of the dynamic truck-routing algorithm for $R = 2$ trucks. The time
axis is discretized into 5-min intervals. The planning horizon is the maximum of $N_{\text{truck}} = 4$
stations visited by the truck and $T_{\text{truck}} = 40$ min; the implementation horizon is $T_{\text{impl}} =
30$ min. Each line represents the route of one truck, where journeys are indicated by a solid
line if they start within the implementation horizon (i.e. they are definitely executed) or by
a dashed line if they start within the planning horizon (i.e. they may be subject to change at
future re-plannings). The trucks wait at each stop for 5 min in order to load or unload
bikes. The number of bikes loaded and unloaded at each stop is indicated as well.

In our model, the time it takes for a truck to traverse an arc is discretized to multiples
of 5 minutes. It is computed based on the Euclidean distance (in km) $d_{\text{eucl}}(s_i, s_j)$ between
two stations $s_i, s_j \in S$, assuming an average speed of 15 km/h for the truck in city traffic.
Including an additional 5 min for bicycle handling after reaching the station, the resulting
effective journey time (in time steps of 5 min) for a truck to go from station $s_i$ to $s_j$ is

$$d(s_i, s_j) := \left\lceil d_{\text{eucl}}(s_i, s_j) / 1.25 \right\rceil + 1. \tag{10}$$

Note that the dividing factor of 1.25 results from converting distance (in km) into time steps
(of 5 min): at 15 km/h, a truck covers 1.25 km per time step. The repositioning trucks have
limited operation hours during the day, set to 7 am to 10 pm. All trucks are constrained to
start at a maintenance depot in the morning, and also to finish at this depot at the end of
the working day. As a consequence, vertices from which no combination of arcs leads back
to the depot on time are excluded from the graph. An example of a network of stations and
a corresponding time-expanded network are shown in Figures 7 and 8.
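The journey time of Eq. (10) is easy to check numerically. The sketch below assumes that the brackets in Eq. (10) denote rounding up to the next full 5-minute step; the constant and function names are hypothetical.

```python
import math

TRUCK_SPEED_KMH = 15.0   # average truck speed in city traffic, from the text
STEP_MINUTES = 5         # length of one time step
KM_PER_STEP = TRUCK_SPEED_KMH * STEP_MINUTES / 60.0  # distance per step: 1.25 km


def journey_steps(distance_km):
    """Effective truck journey time in 5-minute steps.

    Travel time is rounded up to whole steps (assumed reading of Eq. (10)),
    and one extra step is added for handling bicycles at the station.
    """
    return math.ceil(distance_km / KM_PER_STEP) + 1
```

A 2 km trip therefore takes 3 steps (15 min): two steps of driving, rounded up from 1.6, plus one step of handling.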

Figure 7: Example station graph, with arcs weighted according to distance.

Figure 8: Example time-expanded network. Dark-grey vertices are marked "dead" due to the
terminal condition.

4.2 Computing single-truck routes 

In this section, we present an algorithm for finding a repositioning route for a single truck
in the dynamic case. The amount of time available for repositioning is assumed to be fixed.
In contrast to the static repositioning case, the goal is not to attain a defined system state
with as few resources as possible, but rather to optimally invest the available resources, i.e.
operational hours of the trucks. Thus, based on the utility of fill level changes defined in
Section 3, the ratio of added utility per invested time of a truck $r$ is maximized.



Constructing a tree of promising candidate routes. Every truck $r \in R$ can hold
a maximum load of $l_{\max} = 20$ bikes and holds $l_r^t \in \{0, \ldots, l_{\max}\}$ bicycles at time $t$. The
goal is to determine a truck's repositioning actions $p_i = (v_i, \Delta f_i, l_i) \in P$ with $v_i \in V$,
$l_i \in \{0, \ldots, l_{\max}\}$, $i \in \{1, \ldots, N_{\text{truck}}\}$. The truck load $l_i$ is defined as the number of bicycles
on the truck after the repositioning action $\Delta f_i$ performed at the station and time indicated
by the vertex $v(p_i)$ of the time-expanded network. In particular, for every pair of actions
$p_i, p_{i+1}$ there must exist an arc $a \in A$ with

$$v_1(a) = v_i \ \wedge\ v_2(a) = v_{i+1}, \quad \forall i \in \{1, \ldots, N_{\text{truck}} - 1\}. \tag{11}$$

Moreover, the following consistency constraints for station fill levels and truck loads must
hold:

$$\tilde{f}(v_i) = f(v_i) + \Delta f_i \in \left[0, f^{\max}_{s(v_i)}\right], \quad \forall i \in \{1, \ldots, N_{\text{truck}}\}, \tag{12a}$$

$$l_{i+1} = l_i - \Delta f_{i+1} \in \{0, \ldots, l_{\max}\}, \quad \forall i \in \{1, \ldots, N_{\text{truck}} - 1\}. \tag{12b}$$

Starting from an initial repositioning position $p_1$, the possible next steps can be
represented by a tree graph $\Phi$, where each node $\phi$ represents a specific repositioning action
and each leaf node determines a unique truck route. The tree of all possible routes has a
branching factor of $|S| - 1$, since a route could possibly lead to any of the other stations for
the next repositioning action. So there are $(|S| - 1)^{N_{\text{truck}} - 1}$ possible combinations of stations
for a route of length $N_{\text{truck}}$ (where the initial position $p_1$ is already known). For systems
consisting of several hundred stations it is thus not viable to test every possible
combination. The complexity of the problem is reduced by concentrating on a subset of possible
routes corresponding to the most promising repositioning actions. This works by constructing
a pruned version of the route tree. Starting from the truck's initial position (at the root), the
tree is recursively extended at each of its leaf nodes until it has reached the desired height
of $N_{\text{truck}}$ (or the journey has a minimum duration of $T_{\text{truck}}$):

1. The current leaf node is $\phi = (v_\phi, \Delta f_\phi, l_\phi)$. First, we compute the "value per unit
distance" the truck might bring by going to any of the other stations. The set
$V_\phi := \{v \in V : \exists a \in A,\ v_1(a) = v_\phi,\ v_2(a) = v\}$ contains the vertices of the time-expanded
network the truck would reach by going to any of the other stations. Since
we know the current load of bicycles $l_\phi$, the best action to be performed at $v \in V_\phi$
can be computed with a "greedy" approach as

$$\Delta f^*(v, \phi) = \begin{cases} \max\left(l_\phi - l_{\max},\ \overline{f}(v) - f(v)\right), & \text{if } f(v) > \overline{f}(v) \\ \min\left(l_\phi,\ \underline{f}(v) - f(v)\right), & \text{if } f(v) < \underline{f}(v) \\ 0, & \text{else.} \end{cases} \qquad (13)$$

We choose the K vertices with the best ratio $\Delta f^*(v, \phi) / d(v_\phi, v)$ and add them as
leaves of $\phi$ in the form of route steps. The set of new leaf nodes is $\Phi_\phi$.
2. In addition, we add stations that could serve as an intermittent depot. Going there
may not yield a direct utility, but the possibility to bring or take bicycles may be of
use at other stations of the route. We choose

$$v_{store} = \arg\max_{v \in V :\, s(v) \neq s(v_\phi)} \frac{\min\left(l_{depot},\ \overline{f}(v) - f(v)\right)}{d(v_\phi, v)} \qquad (14a)$$

$$v_{pick} = \arg\max_{v \in V :\, s(v) \neq s(v_\phi)} \frac{\min\left(l_{depot},\ f(v) - \underline{f}(v)\right)}{d(v_\phi, v)} \qquad (14b)$$

and add them to the set of leaves $\Phi_\phi$ with a repositioning action of zero. To prevent
$v_{store}, v_{pick}$ from only being set to very large stations, we cap the maximum intermittent
depot size considered to some $l_{depot} \le l_{\max}$. How stations that may serve as
intermittent depots are incorporated into the actions at other steps of the route is
explained in the following.
3. If the depth of the recursive procedure has not yet reached the final depth $N_{truck}$,
it is repeated for every $\phi' \in \Phi_\phi$. Before entering the recursive procedure at $\phi'$, the
corresponding repositioning action $\Delta f_{\phi'}$ is incorporated into the predicted future fill
levels of the time-expanded network. These changes, of course, have to be unwound
between the $\phi' \in \Phi_\phi$.
If even more aggressive tree pruning is necessary to comply with computational con- 
straints, the similar Beam Search P^TI^ can be applied. It leads to linear complexity in the 
route length, but at the expense of discarding more potentially optimal solutions. In Beam 
Search, a greedy approach is used to determine promising next steps as well. But only K 
leaf nodes are added in total to all nodes of the same height. Resorting to Beam Search 
was, however, not necessary for the route length horizon used in the sample setting of this 
paper. 
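
The greedy expansion of step 1 can be illustrated with a minimal sketch. It assumes the sign convention that a positive action adds bicycles to a station, and it ranks candidates by the absolute action per unit distance. The `Candidate` record, the function names and the use of the absolute value are assumptions made for illustration, not part of the original method description.

```python
from dataclasses import dataclass

L_MAX = 20  # maximum truck load, as stated in the text


@dataclass
class Candidate:
    """Hypothetical record for one reachable vertex of the time-expanded network."""
    station: str
    fill: float        # predicted fill level f(v)
    plateau_lo: float  # beginning of the utility plateau
    plateau_hi: float  # end of the utility plateau
    distance: float    # d(v_phi, v), e.g. travel time from the current node


def greedy_action(load, c):
    """Greedy repositioning action in the spirit of (13).

    Positive values add bicycles to the station, negative values remove them.
    """
    if c.fill > c.plateau_hi:
        # Station too full: remove bikes, limited by the truck's free space.
        return max(load - L_MAX, c.plateau_hi - c.fill)
    if c.fill < c.plateau_lo:
        # Station too empty: add bikes, limited by the truck's current load.
        return min(load, c.plateau_lo - c.fill)
    return 0


def best_k(load, candidates, k):
    """Return the k (candidate, action) pairs with the largest |action| per unit distance."""
    scored = [(c, greedy_action(load, c)) for c in candidates]
    scored.sort(key=lambda ca: abs(ca[1]) / ca[0].distance, reverse=True)
    return scored[:k]


candidates = [
    Candidate("A", fill=30, plateau_lo=5, plateau_hi=20, distance=2.0),
    Candidate("B", fill=1, plateau_lo=4, plateau_hi=15, distance=1.0),
    Candidate("C", fill=10, plateau_lo=5, plateau_hi=20, distance=1.0),
]
top = best_k(load=8, candidates=candidates, k=2)
```

With a truck carrying 8 bicycles, station A is relieved of 10 bicycles, station B receives 3, and station C (inside its plateau) is not selected.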

Refining truck loading actions. The repositioning actions $p_i$, $i = 1, \dots, N_{truck}$ of the
promising route candidates in $\Phi$ stem from a greedy heuristic that could not know about
stations visited later in the route. Knowing the complete routes, their respective action
profiles should be further refined. As a motivating example, it may be beneficial to pick up
more bicycles than the utility function of a single station $u(f, \Delta f)$ (see Section 3) originally
indicated. This is the case if taking more bicycles has zero utility locally (the fill level remains
within the utility plateau), but a subsequent station in the route can make use of the additional
bicycles. The problem of choosing optimal actions can be formulated as a manageable
quadratic program (QP).



$s(i), t(i)$: The station and time at the i-th step in the route, for $i \in \{1, \dots, N_{truck}\}$.

$f_i$: The expected fill level of station $s(i)$ at time $t(i)$.

$\Delta f_i$: The action performed at step i. This is the optimization variable.

$\underline{f}(i), \overline{f}(i)$: Beginning and end of the utility plateau of station $s(i)$, as described in
Section 3.

$\Delta\underline{f}(i), \Delta\overline{f}(i)$: Difference between the new fill level $f_i + \Delta f_i$ and the plateau
beginning/end $\underline{f}(i), \overline{f}(i)$. The difference is defined to grow positively going
outwards from the respective side of the plateau.

$\Delta\underline{f}'(i), \Delta\overline{f}'(i)$: Auxiliary variables containing the absolute difference from the plateau
beginning, $\Delta\underline{f}'(i) = |\Delta\underline{f}(i)|$, or end, $\Delta\overline{f}'(i) = |\Delta\overline{f}(i)|$. They are correctly
set by the solver minimizing costs within the bounds set in (16d) and (16e).

$l_0, l_{\max}$: Starting fill level of the repositioning truck and the maximum truck load
capacity.

$\varrho$: Scaling factor for penalizing repositioning actions.

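$$\min_{\Delta f} \sum_i \left( \Delta\underline{f}(i) + \Delta\underline{f}'(i) + \Delta\overline{f}(i) + \Delta\overline{f}'(i) \right) + \sum_i \Delta f_i^2 / \varrho \qquad (15)$$

such that

$$0 \le l_0 - \sum_{i'=1}^{i} \Delta f_{i'} \le l_{\max}, \quad \forall i \in \{1, \dots, N_{truck}\} \qquad (16a)$$

$$\Delta\underline{f}(i) = \underline{f}(i) - f_i - \Delta f_i, \quad \forall i \in \{1, \dots, N_{truck}\} \qquad (16b)$$

$$\Delta\overline{f}(i) = -\overline{f}(i) + f_i + \Delta f_i, \quad \forall i \in \{1, \dots, N_{truck}\} \qquad (16c)$$

$$-\Delta\underline{f}'(i) \le \Delta\underline{f}(i) \le \Delta\underline{f}'(i), \quad \forall i \in \{1, \dots, N_{truck}\} \qquad (16d)$$

$$-\Delta\overline{f}'(i) \le \Delta\overline{f}(i) \le \Delta\overline{f}'(i), \quad \forall i \in \{1, \dots, N_{truck}\} \qquad (16e)$$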

The linear part of the objective function (15) evaluates repositioning actions according
to the utility definition of Section 3. The quadratic part of (15) minimizes the action of the
truck operators (they take the fewest bicycles possible). It is scaled such that actions with
a positive utility will still be performed; however, it prevents actions causing negative utility
at one station and an equal positive utility at another step in the route. This also ensures
that stations will not be pushed outwards from the plateau and actions remain feasible.
Equation (16a) ensures that the fill level of trucks stays within the capacity constraints.
If computation time allows, an integer constraint $\Delta f_n \in \mathbb{Z}, \forall n$ can be added to reflect
the discrete number of bikes. This turns the optimization problem into a mixed-integer
quadratic program (MIQP). In our approach, we manually fit the solution to the truck load
constraints based on the relaxed QP solution by clipping any non-integer parts. In test runs,
no or only very small differences from the MIQP were observed.

We now determine the best set of repositioning actions for all promising routes and 
choose the route that results in the best overall utility increase per unit time. 
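
The motivating example of picking up extra bicycles inside a plateau can be checked with a small sketch. Instead of a QP solver it enumerates all integer action profiles for a short route, which is only practical for very few steps and is not the solution method described above. The function names, the example data and the value of the scaling factor `rho` are assumptions chosen for illustration.

```python
from itertools import product


def route_cost(actions, fills, lows, highs, rho):
    """Objective (15): twice the distance outside the plateau plus a quadratic action penalty."""
    cost = 0.0
    for df, f, lo, hi in zip(actions, fills, lows, highs):
        d_lo = lo - f - df    # (16b), positive when the new fill level is below the plateau
        d_hi = -hi + f + df   # (16c), positive when the new fill level is above the plateau
        cost += d_lo + abs(d_lo) + d_hi + abs(d_hi)
        cost += df * df / rho
    return cost


def load_feasible(actions, l0, l_max):
    """Constraint (16a): the truck load stays within [0, l_max] after every step."""
    load = l0
    for df in actions:
        load -= df  # bikes added to a station are removed from the truck
        if not 0 <= load <= l_max:
            return False
    return True


def refine_actions(fills, lows, highs, l0, l_max, rho):
    """Enumerate all integer action profiles and return the cheapest feasible one."""
    best = None
    for actions in product(range(-l_max, l_max + 1), repeat=len(fills)):
        if not load_feasible(actions, l0, l_max):
            continue
        cost = route_cost(actions, fills, lows, highs, rho)
        if best is None or cost < best[0]:
            best = (cost, actions)
    return best[1]


# Step 1 visits a station inside its plateau, step 2 a station below its plateau.
# The truck starts empty, so a purely greedy plan would do nothing at either stop.
refined = refine_actions(fills=[15, 2], lows=[5, 6], highs=[20, 15],
                         l0=0, l_max=20, rho=801.0)
```

In this example the refined profile removes four bicycles at the first stop, where this has no local utility cost, and delivers them at the second stop.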

4.3 Routing multiple trucks 

Co-optimized routes for several trucks are too difficult to compute online within the time 
constraints of the PBS system. Therefore, we resort to optimizing multiple truck routes 
sequentially where actions of prior trucks can be treated as known. These actions are 
manifest in the future fill levels stored in the graph of the time-expanded network. 

The basic idea is to repeat adding route steps for each truck until it reaches a minimum
number of $N_{truck}$ steps, or a journey time of $T_{truck}$. This is performed sequentially, starting
with the truck that has the minimum time index for the last step in its route and has
not yet reached the required route length. The reason for this sequential procedure is to
prevent the collision of two truck routes, which can be detrimental to the performance of
the algorithm, as explained below. Assume, without loss of generality, that the trucks $r \in R$
are ordered according to the current computation of their routes. If truck r chooses to
go to a station $s(v)$ to which a truck $r' < r$ has already planned to go at a later time
$t(v') > t(v)$, $s(v') = s(v)$, then $r'$ has made its choice based on false assumptions about the
station's fill level. These collisions can be handled by

• Removing all routing steps that were added during the last route-search step from 

• Removing all but the first routing steps that were added during the last route-search 
step from fl^- 

Collisions are thus prevented, and since at least one new routing step is preserved per
detected collision, the algorithm will eventually reach $T_{truck}$ for all trucks.
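
The ordering rule (always extend the truck whose route currently ends earliest) can be sketched with a priority queue. The sketch deliberately omits collision handling and the step-count criterion, and it assumes a hypothetical callback `extend_route` that returns at least one new step with a later time index.

```python
import heapq


def plan_sequentially(start_times, extend_route, t_truck):
    """Extend truck routes one search step at a time, earliest route end first.

    start_times: dict mapping truck id to the time index of its current last step.
    extend_route: hypothetical callback (truck, time) -> list of new step times.
    t_truck: required journey time; routes ending at or after it are complete.
    """
    routes = {r: [t] for r, t in start_times.items()}
    queue = [(t, r) for r, t in start_times.items()]
    heapq.heapify(queue)
    while queue:
        t, r = heapq.heappop(queue)
        if t >= t_truck:
            continue  # this truck's route is long enough
        routes[r].extend(extend_route(r, t))
        heapq.heappush(queue, (routes[r][-1], r))
    return routes


# Toy example: truck "a" needs 2 time units per step, truck "b" needs 3.
routes = plan_sequentially(
    {"a": 0, "b": 1},
    lambda r, t: [t + (2 if r == "a" else 3)],
    t_truck=6,
)
```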

5 Dynamic price incentives for users 

Customers themselves might contribute to the rebalancing of a PBS scheme if offered an 
appropriate payment. In this paper we consider how payments could be offered to customers 
to change the endpoint of their journey to a nearby station in a way that improves the overall 
service level. To this end, we take the model of how customers accept (or choose between) 
price offers, as described in Section 2.3, and then form an optimization problem trading off
the expected payouts and the expected improvement in service level. The solution of this 
optimization problem is a set of price offers that are presented to any customer arriving at 
a given station. It seems reasonable to reduce the complexity of this optimization problem 
by limiting the number of prices offered (= decision variables) to 10 per station; i.e. for each 
station, only 10 prices are quoted for going to selected neighboring stations. 

We assume that means of communicating the price incentives and for making payments 
are available. A payment infrastructure is already central to existing systems, like the Oyster 
Card for London's public transport network. New information capabilities could be added to 
the kiosk terminals used for rental, and/or the mobile applications many customers already 
use. 

Using price incentives to induce a desired behavior in the users of shared mobility systems
has been examined in multiple contexts in recent literature. The work of [14] examines
user-based repositioning in a shared mobility system. However, the approach of splitting and
merging rides can only be applied to cars and not to public bike hire schemes. Incentives
for bike sharing schemes are investigated by [24], where users pick two stations at random
and go to the emptier one.

In this paper, we propose a novel scheme that is based on Model Predictive Control
(MPC). A short summary of MPC is given in Section 5.1. In Section 5.2 we then show how
the customer reaction model from Section 2 can be linearized in order to obtain a tractable
MPC problem formulation. Finally, Section 5.3 explains the details of the corresponding
optimization problem that has to be solved in a receding-horizon fashion in order to determine
the real-time price incentives.

5.1 Model Predictive Control 

Here we provide a short introduction to Model Predictive Control (MPC [25]), giving the
basic explanations required to describe the controller developed in Section 5.3. The
fundamental idea is to employ a model of a given system in order to optimize the inputs given to
the system over a finite control horizon. Only the first input is then applied to the system,
and the scheme continues by measuring the new state of the system and solving another
finite-horizon optimization problem ("MPC problem").

The two main aspects of MPC comprise good control decisions for the system with 
respect to anticipated future events (by the optimization), and feedback in the case of 
unforeseen disturbances or model inaccuracies (by re-optimizing periodically for the control 
actions). An important strength of MPC is its ability to incorporate a model of the system 
dynamics and to handle constraints on the states and control inputs. For example, in the 
case of the PBS this is advantageous because it allows upper and lower bounds to be placed 
on the computed price incentives (control inputs) and the stations' fill levels (system state). 

Let the system state at time step t (i.e. a vector containing the fill levels of all stations)
be denoted by $x(t)$. The future state evolves as some function of the current state and the
control inputs $u(t)$ (i.e. a vector of the price incentives offered to the users), so that the
subsequent state is given by $x(t+1) = f_t(x(t), u(t))$. Here $f_t$ represents only a simplified
model of the actual system dynamics, which is subject to model uncertainty and disturbances.
Note that the function $f_t$ depends on the predicted customer interaction and is
therefore time-varying (recall that customer interaction shows time-varying patterns). The
actual deviation from the predicted customer interaction is uncertain, and is thus considered
a disturbance to our model. Moreover, given that the truck routes are known (from
Section 4), the functions $f_t$ also contain changes to the system state caused by manual
repositioning.

Assume the current time to be $t = 0$ without loss of generality. Now we wish to choose a
finite series of inputs $u(t)$, where $t = 0, \dots, T-1$, so that the system behaves optimally over
the finite time horizon T, starting from the measured current state $x(0)$. This is done by
solving an optimization problem, trading off the perceived cost of having suboptimal system
states $c^x(x(t))$, and the cost of applying the control input $c^u(u(t))$:

$$\min_{u(0), \dots, u(T-1)} \sum_{t=1}^{T} c^x(x(t)) + \sum_{t=0}^{T-1} c^u(u(t)) \qquad (17)$$

such that

$$x(t+1) = f_t(x(t), u(t)), \quad t = 0, \dots, T-1 \qquad (18a)$$

$$u(t) \in U_t, \quad t = 0, \dots, T-1 \qquad (18b)$$

$$x(t) \in X_t, \quad t = 1, \dots, T \qquad (18c)$$

where the functions $c^x$ and $c^u$ are called stage costs for the state and input respectively, and
sets $X_t$ and $U_t$ represent any constraints that may be present on the state and input.

Although solving problem (17) gives a series of inputs $u(t)$, $t = 0, \dots, T-1$, only $u(0)$
is applied to the system. In the next time step, a new measurement of the current state
is made, and a new series of planned control inputs is determined by re-solving the MPC
problem in light of the new information. For this reason, MPC is also known as "Receding
Horizon Optimal Control".
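
The receding-horizon principle can be illustrated on a toy scalar system. The sketch below is not the controller of this paper: the model, the three-element input set, the quadratic stage costs and the enumeration of input sequences are all assumptions chosen to keep the example self-contained.

```python
from itertools import product

INPUTS = (-1.0, 0.0, 1.0)  # hypothetical input constraint set U_t


def model(x, u, forecast):
    """Simplified prediction model f_t: next state from state, input and forecast."""
    return x + u + forecast


def plan(x0, forecasts, r=0.1):
    """Solve the finite-horizon problem by enumeration; return the best input sequence."""
    best_cost, best_seq = float("inf"), None
    for seq in product(INPUTS, repeat=len(forecasts)):
        x, cost = x0, 0.0
        for u, d in zip(seq, forecasts):
            x = model(x, u, d)
            cost += x * x + r * u * u  # stage costs for state and input
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def run_mpc(x0, forecasts, disturbances, horizon):
    """Receding-horizon loop: plan, apply only the first input, measure, repeat."""
    x, applied = x0, []
    for t in range(len(disturbances)):
        u0 = plan(x, forecasts[t:t + horizon])[0]
        applied.append(u0)
        # The real system deviates from the model by an unforeseen disturbance.
        x = model(x, u0, forecasts[t]) + disturbances[t]
    return x, applied


final_state, inputs = run_mpc(3.0, [0.0] * 5, [0.0] * 5, horizon=3)
```

Starting from a state of 3 with no disturbances, the loop applies the input -1 three times and then holds the state at zero.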

To make problem (17) tractable, the system model must often be simplified. In particular,
non-linear dynamics $f_t(x(t), u(t))$ make the problem non-convex. For many systems,
though, good control performance can still be achieved if the model is linearized.

5.2 Simplified model of customer behavior 

We now derive an approximate model of customer behavior that can be used in the context
of MPC. Customer behavior means their response to price incentives, as described in (7), and
therefore enters into the system dynamics $f_t(x(t), u(t))$. However, this response is nonlinear,
which, as described in the preceding section, leads to a non-convex MPC problem (17).

We first explain the origin of this nonlinearity. Consider two neighbor stations $n', n'' \in
N_s$ with equal distance to s, for which the incentives offered from station s are equal,
$p_{s,n'} = p_{s,n''}$. If there are customers equally willing to go to $n'$ or $n''$, an infinitesimally
small increase in $p_{s,n'}$ would cause all those customers to choose $n'$ if we assume they act
totally rationally. The customer reaction to incentives is thus discontinuous in the prices,
and the true behavior model (7) will not lead to a tractable optimization problem.
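
The discontinuity can be reproduced with a toy model of perfectly rational customers. The decision rule below (accept the offer with the largest positive net benefit) and the numerical values are assumptions for illustration. They are not the customer model (7) itself.

```python
def rational_choice(payouts, detour_costs):
    """Index of the offer with the largest positive net benefit, or None if none is positive."""
    net = [p - c for p, c in zip(payouts, detour_costs)]
    best = max(range(len(net)), key=lambda n: net[n])
    return best if net[best] > 0.0 else None


def share_choosing_first(p_first, p_second, customers):
    """Fraction of customers who pick the first of two neighbours."""
    picks = [rational_choice((p_first, p_second), costs) for costs in customers]
    return sum(1 for n in picks if n == 0) / len(customers)


# 100 customers for whom both neighbours are equally far away.
customers = [(0.5, 0.5)] * 100
just_below = share_choosing_first(1.0 - 1e-9, 1.0, customers)
just_above = share_choosing_first(1.0 + 1e-9, 1.0, customers)
```

A price change of two billionths moves the share of customers choosing the first neighbour from 0 to 1, which is the jump that the linearized model smooths out.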

To formulate a tractable MPC problem we approximate $\pi(s, n, p_s)$ in a linear fashion and
choose a convenient set $N_s$ of N nearby neighbors for each station, so that $|N_s| = N$. The
linearization $\tilde{\pi}(s, n, p_s)$ is computed using Algorithm 2, which creates samples of customer
reactions to random incentive offers (to all neighboring stations) and performs a least-squares
fit between observed behavior and the linear model. The linear dependency on
offered incentives, where customers reject station s and go to neighbor n instead, is defined
by vectors $\pi_{s,n}$ of size N, for each $s \in S$, $n \in N_s$:

$$\tilde{\pi}(s, n, p_s) = \pi_{s,n}^T p_s \qquad (19)$$
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Algorithm 2 Fitting the linear customer behavior model 



Require: $s \in S$,
 1: $N_s$,                          ▷ $|N_s| = N$ nearest neighbours around station s
 2: taken_incentive(s, $N_s$, p),   ▷ Neighbor chosen by the customer as described in Section 2.3
 3: $p_{\max}$,                     ▷ Maximum payout
 4: $P \gg 0$,                      ▷ Number of generated payout vectors (samples)
 5: $C \gg 0$,                      ▷ Number of customer behavior samples
 6: $\Omega$                        ▷ Set of samples. Each sample is a tuple of two vectors: the offered
                                      payouts p to the N neighbours and the percentage of customers
                                      taking a certain incentive $\delta$.
 7: for i = 1 to P do
 8:     $p \leftarrow p_{\max} \cdot \mathrm{RAND}(N) \in [0, p_{\max}]^N$      ▷ Payout vector
 9:     $e \leftarrow \{0\}^N$                                                 ▷ Initialize behavior count
10:     for c = 1 to C do
11:         $n' \leftarrow$ taken_incentive(s, $N_s$, p) $\in \{N_s, \emptyset\}$
12:         $e_{n'} \leftarrow e_{n'} + 1$
13:     end for
14:     $\delta \leftarrow e / C$                                              ▷ Fraction taking a certain incentive
15:     $\Omega(i) \leftarrow (p, \delta)$
16: end for
17: for n = 1 to N do
18:     $\pi_{s,n} \leftarrow \arg\min_{\pi} \sum_{i \in \{1, \dots, P\}} \left( \pi^T \Omega(i)_p - \Omega(i)_{\delta,n} \right)^2$
19: end for
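
A runnable sketch of Algorithm 2 follows. Since the customer model of Section 2.3 is not reproduced here, `taken_incentive` is replaced by a hypothetical stand-in with random detour costs. The least-squares fit is solved through the normal equations with a small Gauss-Jordan routine so that only the standard library is needed. All function names and parameter values are assumptions.

```python
import random


def taken_incentive(payouts, rng):
    """Hypothetical stand-in for the customer model of Section 2.3.

    Each customer draws a random detour cost per neighbour and accepts the offer
    with the largest positive net benefit; None means every offer is declined.
    """
    net = [p - rng.uniform(0.0, 2.0) for p in payouts]
    best = max(range(len(net)), key=lambda n: net[n])
    return best if net[best] > 0.0 else None


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear_model(n_neighbours, p_max, n_payouts, n_customers, seed=0):
    """Return pi, where pi[n] is the least-squares weight vector for neighbour n."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_payouts):
        p = [p_max * rng.random() for _ in range(n_neighbours)]
        counts = [0] * n_neighbours
        for _ in range(n_customers):
            choice = taken_incentive(p, rng)
            if choice is not None:
                counts[choice] += 1
        samples.append((p, [c / n_customers for c in counts]))
    # Normal equations (P^T P) pi_n = P^T delta_n, one solve per neighbour.
    gram = [[sum(p[i] * p[j] for p, _ in samples) for j in range(n_neighbours)]
            for i in range(n_neighbours)]
    pi = []
    for n in range(n_neighbours):
        rhs = [sum(p[i] * d[n] for p, d in samples) for i in range(n_neighbours)]
        pi.append(solve(gram, rhs))
    return pi


pi = fit_linear_model(n_neighbours=3, p_max=1.0, n_payouts=200, n_customers=200)
```

In this toy setting the fitted weight of a neighbour's own price is positive and larger than the weights of the competing prices, which matches the intuition that raising the payout to one neighbour draws customers towards it.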
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5.3 Computing dynamic price incentives 

In this subsection we formulate an MPC problem, the solution of which gives the price
incentives $p_s(t)$ that should be offered to customers for $t = 0, \dots, T-1$, where $p_s(0)$ are the
prices to be issued immediately, and $p_s(1), \dots, p_s(T-1)$ are prices planned for subsequent
steps.

The number $f_s(t)$ of bikes present at station s at time t evolves according to the original
arrival rate $\lambda_s(t)$ and net change $\eta_s(t)$ described in Section 2, along with a modification
$\gamma(s, \lambda(t), p(t))$ due to customers taking price incentives and another, $\Delta f_s(t)$, due to trucks
adding or taking away bikes from the stations. $\bar{N}_s$ denotes the set of stations having s as
one of their nearest neighbors.

$$\gamma(s, \lambda(t), p(t)) = \sum_{n \in \bar{N}_s} \pi(n, s, p_n(t)) \cdot \lambda_n(t) \qquad (20)$$

$$f_s(t+1) = f_s(t) + \eta_s(t) + \gamma(s, \lambda(t), p(t)) + \Delta f_s(t). \qquad (21)$$

Note that $\sum_{s \in S} \gamma(s, \lambda(t), p(t)) = 0$ since the total number of bikes in the system must
be constant. Also, the controller assumes that customers who take an incentive go from 
their originally intended destination to the new one within the same time step. We do this 
to make the MPC problem easier to solve, and assume it does not distort predictions of 
customer actions too much. 
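
The fill-level update and the conservation property can be illustrated with a two-station sketch. It books every diverted customer as an outflow at the station they reject and an inflow at the neighbour they choose, following the two sums of constraint (23b). The data layout (dictionaries keyed by station, weight vectors keyed by station pairs) and all numbers are assumptions.

```python
def incentive_shift(stations, neighbours, pi, prices, arrivals):
    """Net change gamma[s] in returned bikes caused by customers accepting incentives."""
    gamma = {s: 0.0 for s in stations}
    for s in stations:
        for n in neighbours[s]:
            # Linearized share of arrivals at s that is diverted to neighbour n.
            share = sum(w * p for w, p in zip(pi[s, n], prices[s]))
            moved = share * arrivals[s]
            gamma[s] -= moved
            gamma[n] += moved
    return gamma


def step_fill(fill, net_change, gamma, truck):
    """One step of (21): new fill = old fill + net change + incentive shift + truck action."""
    return {s: fill[s] + net_change[s] + gamma[s] + truck[s] for s in fill}


stations = ["A", "B"]
neighbours = {"A": ["B"], "B": ["A"]}
pi = {("A", "B"): [0.2], ("B", "A"): [0.1]}   # hypothetical fitted weights
prices = {"A": [1.0], "B": [0.0]}             # only station A offers an incentive
arrivals = {"A": 10.0, "B": 4.0}

gamma = incentive_shift(stations, neighbours, pi, prices, arrivals)
new_fill = step_fill({"A": 18.0, "B": 3.0}, {"A": 1.0, "B": -1.0},
                     gamma, {"A": 0.0, "B": 0.0})
```

Two of the ten customers arriving at the nearly full station A are diverted to B, and the shifts sum to zero across the system.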

We now specify the components of MPC problem (17). Using the linearized model of
customer reactions to incentives from Section 5.2 and defining quadratic stage costs $c^x$
and $c^u$, the MPC problem becomes a quadratic program (QP). Under the assumption of a linear
customer response to prices, the expected payouts are a quadratic function of the prices, and
the input cost in the MPC problem represents a real monetary cost to the system operator.
The state cost aims to penalize loss of customer service. The resulting problem can be
stated as follows:

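$$\min_{p(t)} \sum_{t=1}^{T_{price}} \sum_{s} Q_s(t)\, \tilde{f}_s(t)^2 + \sum_{t=0}^{T_{price}-1} \sum_{s} R_s(t)\, p_s(t)^2 \qquad (22)$$

such that

$$\tilde{f}_s(t) = f_s(t) - \tfrac{1}{2}\left(\underline{f}_s + \overline{f}_s\right), \quad \forall s \in S, \forall t \qquad (23a)$$

$$f_s(t+1) = f_s(t) + \eta_s(t) + \Delta f_s(t) + \sum_{n \in \bar{N}_s} \left(\pi_{n,s}^T p_n(t)\right) \lambda_n(t) - \sum_{n \in N_s} \left(\pi_{s,n}^T p_s(t)\right) \lambda_s(t), \quad \forall s \in S, \forall t \qquad (23b)$$

$$\sum_{n \in N_s} \pi_{s,n}^T p_s(t) \le 1, \quad \forall s \in S, \forall t \qquad (23c)$$

$$0 \le p_{s,n}(t) \le p_{\max}, \quad \forall s \in S, \forall t,\ n \in \{1, \dots, N\} \qquad (23d)$$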

The cost weights $Q_s(t)$ and $R_s(t)$ in the cost function (22) are used to penalize the deviation
$\tilde{f}(t)$ from the optimal state $\tfrac{1}{2}(\underline{f} + \overline{f})$, and the cost caused by the incentives payout,
respectively. A weighting factor $\alpha$ is used to adjust the relative costs associated with having
stations deviate from their optimal point of operation in the middle of the utility plateau
(which we assume to lead to a lower service level), and cash payouts:

$$Q_s(t) = 1 / (\overline{f}_s - \underline{f}_s), \qquad (24)$$

$$R_s(t) = \alpha \sum_{n} \pi_{s,n}^T \lambda_s(t). \qquad (25)$$

A lower value of a leads to a lower relative penalty for paid incentives, likely leading to 
higher price incentives applied. 

Equation (23a) transforms the number of bikes at each station to a quantity measured
relative to the "best" fill level, the middle of the station's utility plateau. The predicted
system states within the horizon are defined by (23b), which includes the expected arrival and
departure rates as well as the linearized model of customer behavior. Equation (23c) limits
the payouts such that no more than 100% of arriving customers take an incentive to one of
the neighbours, and (23d) ensures that payouts are at most $p_{\max}$.
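
As a brief check of the statement that expected payouts are quadratic in the prices, consider a single station s under the linear model (19). The matrix $\Pi_s$, whose rows are the vectors $\pi_{s,n}^T$, is a symbol introduced here for illustration only. Each customer diverted to neighbour n is paid $p_{s,n}(t)$, so the expected payout at s in step t would be:

```latex
\begin{aligned}
\mathbb{E}\big[\text{payout}_s(t)\big]
  &= \sum_{n \in N_s} \lambda_s(t)\,\big(\pi_{s,n}^T p_s(t)\big)\, p_{s,n}(t) \\
  &= \lambda_s(t)\; p_s(t)^T\, \Pi_s\, p_s(t).
\end{aligned}
```

The first line multiplies the expected number of customers diverted to each neighbour by the price paid to them; the second collects the sum into a quadratic form, which is why a quadratic input cost is a natural model of the operator's monetary outlay.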

6 Simulation 

6.1 Simulation setting 

Based on the assumptions and the system model developed in Section 2, a Monte-Carlo
simulation is used to compare the two approaches for bike repositioning. It is important
to note that although simplified models of the system are used to choose truck actions
and prices, we use the full model derived from historical data as described in Section 2
to simulate the actual behavior of customers. We simulate first a sequence of weekdays, 
then a sequence of weekend days, bearing in mind that demand patterns differ significantly 
between the two. Every simulation run consists of a 24h burn-in period starting from the 
initial system configuration, in order to reduce the dependence of our results on this initial 
configuration. Then, three consecutive days are simulated, for which the statistics gathered
are presented below. In accordance with [ ], redistribution with trucks is performed during
8am-10pm.

Figure 9a: Service level for weekdays, as a function of number of trucks and total
payouts. (Axes: number of trucks; incentives payout in pounds.)

Figure 9b: Service level for weekend days, as a function of number of trucks and total
payouts. (Axes: number of trucks; incentives payout in pounds.)


6.2 Simulation results 

The resulting service level is computed as follows:

$$\text{Service level} = \frac{\text{Potential customers} - \text{No-service events}}{\text{Potential customers}} \qquad (26)$$


When simulating three consecutive weekdays, about 49,800 potential customers are gener- 
ated on average, and for three consecutive weekend days (e.g. a Bank Holiday weekend) 
about 29,900. The number of total no-service events is the sum of customers who could not 
rent a bike at an empty station and customers who wanted to return their bike at a full 
station. 
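
Definition (26) amounts to a one-line computation. In the sketch below the event counts are made-up numbers used only to show the arithmetic. They are not simulation results from this paper.

```python
def service_level(potential_customers, empty_events, full_events):
    """Service level (26): share of potential customers who could be served."""
    no_service = empty_events + full_events
    return (potential_customers - no_service) / potential_customers


# Hypothetical counts: 49,800 potential customers, 3,000 empty and 984 full events.
example = service_level(49800, 3000, 984)
```

With these illustrative counts, 3,984 of 49,800 potential customers are not served, giving a service level of 92%.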

We varied the number of trucks used for repositioning and the level of price incentives given
out (via the choice of the weighting factor $\alpha$ in the price controller). Figure 9 shows how the
service level reported by the simulations varied as a result.

As expected, adding more trucks as well as paying out more in incentives has a posi- 
tive effect on the service level. However, with increasing service level, adding trucks and 
incentives payouts becomes less efficient. Comparing the two simulations, it appears that 
the usage peaks caused by commuters were responsible for most of the service shortfalls 
observed. Most events where a customer could not be served were concentrated on only a 
few stations. 

Figures 10 and 11 show the split of no-service events into "empty events" (where customers
wanting to rent a bike arrive at an empty station) and "full events" (where customers
wanting to return a bike arrive at a full station). Since the number of full events is considerably
lower than the number of empty events, it seems plausible that adding more bikes
could have a positive effect on the service rate.

Figure 10a: No-service events for different numbers of repositioning trucks (no incentives)
for three consecutive weekdays.

Figure 10b: No-service events for different numbers of repositioning trucks (no incentives)
for three consecutive weekend days.

Figure 11a: No-service events during simulations for weekdays with repositioning
trucks. Over the course of 72h ca. 49,800 potential customers arrive.

Figure 11b: No-service events during simulations of weekend days with repositioning
trucks. Over the course of 72h ca. 29,900 potential customers arrive.

7 Conclusions 

This paper considered how a Public Bicycle Sharing scheme could be managed using a 
combination of intelligently routed repositioning trucks and redistribution incentives for 
customers. The truck routes and price incentives were computed using model-based receding 
horizon optimization principles, which took account of expected future customer behavior. 
As the number of trucks was increased, diminishing gains to service level were reported 
for added trucks and customer incentive payouts. Customer payments were shown to be 
a means of reducing service shortfalls, particularly when few repositioning trucks were in 
operation. 

Our results suggest that price incentives are viable for repositioning bicycles in a PBS 
when the commuting rush hour is less prominent. For the London PBS, price incentives 
alone were shown to be enough to keep the service level above 87% on weekends without 
the use of staff. On weekdays however, when many customers use the PBS to commute to 
work, price incentives alone are not sufficient to lift the service level substantially. 

The price control algorithm could be developed further in several ways. Firstly, a field 
trial could be used to improve the accuracy of the customer decision model upon which our 
controller is based. This would reveal the range of price elasticities customers exhibit, and 
also indicate to what extent customer responses to prices are irrational. Secondly, some 
simplifying assumptions (for example deterministic customer arrival, linearized customer 
reaction to incentives) could be replaced by more detailed models. However, it is unclear
how much performance can be gained here, as the prediction horizon for optimization might 
have to be shortened considerably to account for the increase in computational complexity. 
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